Crowdsourcing algorithms for entity resolution
Similar resources
Crowdsourcing Algorithms for Entity Resolution
In this paper, we study a hybrid human-machine approach for solving the problem of Entity Resolution (ER). The goal of ER is to identify all records in a database that refer to the same underlying entity, and are therefore duplicates of each other. Our input is a graph over all the records in a database, where each edge has a probability denoting our prior belief (based on Machine Learning mode...
CrowdER: Crowdsourcing Entity Resolution
Entity resolution is central to data integration and data cleaning. Algorithmic approaches have been improving in quality, but remain far from perfect. Crowdsourcing platforms offer a more accurate but expensive (and slow) way to bring human insight into the process. Previous work has proposed batching verification tasks for presentation to human workers but even with batching, a human-only app...
Errata for "Crowdsourcing Algorithms for Entity Resolution" (PVLDB 7(12): 1071-1082)
Figure 1: Corrected Figure 12(b): #Questions vs. Recall for Places (axes: #Questions, Recall; legend: dec, rand, node). We discovered that there was a duplicate figure in our paper. We accidentally put Figure 13(b) in place of Figure 12(b). We have provided the correct Figure 12(b) above (see Figure 1). Figure 1 plots the recall of various strategies as a function of the number of questions asked for the Places dataset. There w...
Crowdsourcing Entity Resolution: a Short Overview and Open Issues
Entity resolution (ER) is a process to identify records that stand for the same real-world entity. Although automatic algorithms aiming at solving this problem have been developed for many years, their accuracy remains far from perfect. Crowdsourcing is a technology currently investigated, which leverages the crowd to solicit contributions to complete certain tasks via crowdsourced marketplaces...
The Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution
This paper describes a series of experiments in using logistic regression machine learning as a method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as linked or not linked pair, the evaluation of the model’s performance should take into account the transitive closure of its pairwise lin...
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2014
ISSN: 2150-8097
DOI: 10.14778/2732977.2732982